-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: split allow_list_answers column in analytics_summary view #2926
feat: split allow_list_answers column in analytics_summary view #2926
Conversation
642d69e
to
c30abd2
Compare
c30abd2
to
e375827
Compare
) as time_spent_on_analytics_session_minutes, | ||
node_id, | ||
al.allow_list_answers as allow_list_answers, | ||
allow_list_answer_elements->>'proposal.projectType' AS proposal_project_type, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was a little unsure about the format of the new column names.
I went for camelcase snake_case
as I thought it was more consistent with the rest of the table but I'm happy to change to match the exact variable name if preferred?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think keeping these as snake_case
is correct as it's for a table / view. Metabase should automatically convert this to Proposal Project Type
I think 👍
Removed vultr server and associated DNS entries |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fantastic - thanks for picking this up.
A few comments raised by PR -
[null]
values
This should have been handled by this line https://github.com/theopensystemslab/planx-new/blob/main/editor.planx.uk/src/pages/FlowEditor/lib/analyticsProvider.tsx#L535 but maybe it's not working as intended?
{ key: value }
vs{ key: [value] }
Yep this is also a bit of an awkward one - values in arrays can contain multiples (e.g. project type) and strings are a single value (e.g. draw boundary option). I'm not sure how Metabase will handle these? We could concatenate arrays into a string (and sort them so they're consistent) but I would think an array is more usable? Not too sure here - maybe a question for @zz-hh-aa . We can format it however we want is the truth 👍
node_fn
Agreed - this column is probably redundant now we have allow list answers. I vote we remove it.
- Align the data structure for the allow_list between analytics_logs and lowcal_sessions i.e. [{key: [value]}], vs {key: [value]} respectively.
Yep - we should do this!
/** | ||
* If appending to ALLOW_LIST please also update the `analytics_summary` view to | ||
* extract it into it's own column. | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect - I had this exact comment in mind also!
) as time_spent_on_analytics_session_minutes, | ||
node_id, | ||
al.allow_list_answers as allow_list_answers, | ||
allow_list_answer_elements->>'proposal.projectType' AS proposal_project_type, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think keeping these as snake_case
is correct as it's for a table / view. Metabase should automatically convert this to Proposal Project Type
I think 👍
) / 60 as numeric (10, 1) | ||
) as time_spent_on_analytics_session_minutes, | ||
node_id, | ||
al.allow_list_answers as allow_list_answers, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we need to keep the current column - what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it adds much value beyond being a reference to check that data is being extracted properly etc.
I'm tempted to remove it 👀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Counter to my previous point I think it could also be handy to check whether a new value in ALLOW_LIST
has been missed in the view 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep this is a good enough reason for me to keep it - it can just not be used in Metabase directly 👍
left join analytics_logs al on a.id = al.analytics_id | ||
left join flows f on a.flow_id = f.id | ||
left join teams t on t.id = f.team_id | ||
left join lateral jsonb_array_elements(al.allow_list_answers) AS allow_list_answer_elements ON true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice 👌
Thanks for the thorough review and responses 🙌 @DafyddLlyr |
Yeah you're right! I misread the updated code and missed this. It's an odd one but I double checked and these are new records so it's not a legacy issue it seems to be live in production at the moment. Also it's not specific to one This might be worth splitting into it's own PR and including a data cleanup migration as I imagine the |
I agree an array is probably more useful and Metabase might handle it gracefully 🤞 I incorrectly thought this was something to do with how we were writing the values to the logs rather than it being the data itself as below. Taking that into account I reckon we maybe go ahead and see how we get on unless you have any objections @zz-hh-aa |
Nice one! I think I'll update both of these in a follow up PR as I think they're close enough logically in that the As part of it I think I'll also ditch the That will also have to change how the db view is written 😅 Although the column names shouldn't change so I think it's still worth merging this to unblock but in the background change how we store the data 😌 |
If I've understood correctly, I would vote in favour of strings--the Metabase GUI gets confused by JSONs but can handle strings, so it might be council-friendlier to go that route. This is a loosely held opinion! If we go the strings route and later wanted to split the array items up, we could always separate by |
Good point @zz-hh-aa the change here would end up with the same data structure as I'll have a look at updating this PR to make the column values a string separated by |
I tried updating this to turn the array into a string separated by columns with the view and it's surprisingly difficult 😬 I think it's because it's trying to do a lot of transformations at once and it's a bit of a nightmare. I'd propose we make this change then handle it on Metabase with a "Saved Question" written in SQL which should hopefully be able to handle this? |
Yep that's a perfect workaround--thanks for looking into it! |
I think the issue was that |
What
ALLOW_LIST
variablescreated_at
for theanalytics_log
rather than just thecreated_at
for theanalytics
sessionWhy
allow_list_answers
with the new data structure was difficult to work with on MetabaseFollow Up
Noticed that the
allow_list_answer
data structure is inconsistent i.e.{key: value}
rather than{key: [value]}
Noticed that the
null
values are no longer being filtered out:planx-new/editor.planx.uk/src/pages/FlowEditor/lib/analyticsProvider.tsx
Line 467 in 612fa87
[null]
surprisingly often?Check that the value of allow list as an array is usable on Metabase:
Remove the
nodeFn
column fromanalytics_logs
as it's not reliable with the previous changes to the way we gather allow_list answers i.e. it naively stores thenodeToTrack?.data?.fn || nodeToTrack?.data?.val
and missed when the passport variable is defined by the breadcrumbs.Align the data structure for the allow_list between
analytics_logs
andlowcal_sessions
i.e.[{key: [value]}]
, vs{key: [value]}
respectively.